Nested Rollout Policy Adaptation with Selective Policies

Author

Tristan Cazenave

Abstract

Monte Carlo Tree Search (MCTS) is a general search algorithm that has improved the state of the art for multiple games and optimization problems. Nested Rollout Policy Adaptation (NRPA) is an MCTS variant that has found record-breaking solutions for puzzles and optimization problems. It learns a playout policy online that dynamically adapts the playouts to the problem at hand. We propose to enhance NRPA using more selectivity in the playouts. The idea is applied to three different problems: Bus regulation, SameGame and Weak Schur numbers. We improve on standard NRPA for all three problems.
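The abstract says that NRPA learns a playout policy online. As a rough illustration, the following is a minimal sketch of standard NRPA on a made-up toy problem (guessing a hidden sequence). The problem, `TARGET`, `NUM_MOVES`, the move `code`, and all parameter values are assumptions for illustration only and are not taken from the paper. The abstract does not spell out the selectivity rule, so the sketch only marks where such pruning could plausibly go (`legal_moves`).

```python
import math
import random

# Hypothetical toy problem: build a sequence that matches TARGET position by position.
TARGET = [2, 0, 1, 2, 1, 0]
NUM_MOVES = 3


def legal_moves(state):
    """Every move is legal here; a selective policy would prune moves at this point."""
    return list(range(NUM_MOVES))


def code(state, move):
    """Key used to look up a move's weight in the policy: (position, move)."""
    return (len(state), move)


def score(state):
    """Number of positions that agree with TARGET."""
    return sum(1 for a, b in zip(state, TARGET) if a == b)


def playout(policy, rng):
    """Sample a full sequence, choosing each move with probability proportional to exp(weight)."""
    state = []
    while len(state) < len(TARGET):
        moves = legal_moves(state)
        weights = [math.exp(policy.get(code(state, m), 0.0)) for m in moves]
        state.append(rng.choices(moves, weights=weights)[0])
    return score(state), state


def adapt(policy, sequence, alpha=1.0):
    """Return a copy of the policy shifted toward the moves of the given sequence."""
    new_policy = dict(policy)
    state = []
    for best in sequence:
        moves = legal_moves(state)
        z = sum(math.exp(policy.get(code(state, m), 0.0)) for m in moves)
        for m in moves:
            c = code(state, m)
            prob = math.exp(policy.get(c, 0.0)) / z
            new_policy[c] = new_policy.get(c, 0.0) - alpha * prob
        c = code(state, best)
        new_policy[c] = new_policy.get(c, 0.0) + alpha
        state.append(best)
    return new_policy


def nrpa(level, policy, rng, iterations=20):
    """Nested search: level 0 is a playout; higher levels adapt toward their best sequence."""
    if level == 0:
        return playout(policy, rng)
    best_score, best_seq = -1, []
    for _ in range(iterations):
        s, seq = nrpa(level - 1, policy, rng, iterations)
        if s >= best_score:
            best_score, best_seq = s, seq
        policy = adapt(policy, best_seq)  # rebinding keeps the caller's policy untouched
    return best_score, best_seq


if __name__ == "__main__":
    print(nrpa(2, {}, random.Random(0)))
```

Under these assumptions, each level starts from the policy handed down by its parent and adapts only its own copy, so lower levels restart from the parent's policy on every call.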

Similar resources

Nested Rollout Policy Adaptation for Monte Carlo Tree Search

Monte Carlo tree search (MCTS) methods have had recent success in games, planning, and optimization. MCTS uses results from rollouts to guide search; a rollout is a path that descends the tree with a randomized decision at each ply until reaching a leaf. MCTS results can be strongly influenced by the choice of appropriate policy to bias the rollouts. Most previous work on MCTS uses static unifo...

Beam Nested Rollout Policy Adaptation

The Nested Rollout Policy Adaptation algorithm is a tree search algorithm known to be efficient on combinatorial problems. However, one problem with this algorithm is that it can converge to a local optimum and get stuck there. We propose a modification that limits this behavior, and we evaluate it on two combinatorial problems on which Nested Rollout Policy Adaptation is known to perform well.

Application of the Nested Rollout Policy Adaptation Algorithm to the Traveling Salesman Problem with Time Windows

In this paper, we are interested in the minimization of the travel cost of the traveling salesman problem with time windows. In order to do this minimization we use a Nested Rollout Policy Adaptation (NRPA) algorithm. NRPA has multiple levels and maintains the best tour at each level. It consists in learning a rollout policy at each level. We also show how to improve the original algorithm with...

Improved Diversity in Nested Rollout Policy Adaptation

For combinatorial search in single-player games, nested Monte Carlo search is an apparent alternative to algorithms like UCT that are applied in two-player and general games. To trade off exploration against exploitation, the randomized search procedure intensifies the search with increasing recursion depth. If a concise mapping from states to actions is available, the integration of policy learning yiel...

Distributed Nested Rollout Policy for SameGame

Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search heuristic for puzzles and other optimisation problems. It achieves state-of-the-art performance on several games, including SameGame. In this paper, we design several parallel and distributed NRPA-based search techniques, and we provide a number of experimental insights about their execution. Finally, we use our best implementation to...

Publication date: 2016
